Introduction

In this blog post, I want to show you a graph-based way to split a class into several independent ones. We take a small example class from Michael Feathers' book "Working Effectively with Legacy Code" and use Neo4j's Awesome Procedures On Cypher (APOC) to find groups of methods and fields that belong together.

Hint: To run the notebook version of this blog post, you need to install the ipython-cypher extension.

class Reservation {

    private int duration;
    private int dailyRate;
    private Date date;
    private Customer customer;
    private List fees = new ArrayList();

    public Reservation(Customer customer, int duration, int dailyRate, Date date) {
        this.customer = customer;
        this.duration = duration;
        this.dailyRate = dailyRate;
        this.date = date;
    }

    public void extend(int additionalDays) {
        duration += additionalDays;
    }

    public void extendForWeek() {
        int weekRemainder = RentalCalendar.weekRemainderFor(date);
        final int DAYS_PER_WEEK = 7;
        extend(weekRemainder);
        dailyRate = RateCalculator.computeWeekly(
                customer.getRateCode()) / DAYS_PER_WEEK;
    }

    public void addFee(FeeRider rider) {
        fees.add(rider);
    }

    int getAdditionalFees() {
        int total = 0;
        for (Iterator it = fees.iterator(); it.hasNext(); ) {
            total += ((FeeRider) (it.next())).getAmount();

        }
        return total;
    }

    int getPrincipalFee() {
        return dailyRate * RateCalculator.rateBase(customer) * duration;
    }

    public int getTotalFee() {
        return getPrincipalFee() + getAdditionalFees();
    }
}

In [1]:
%load_ext cypher

Cleaning existing data


In [2]:
%%cypher
MATCH ()-[u:USES]->()
DELETE u
WITH count(*) AS deletedUses
MATCH (n:NewClass)
DETACH DELETE n


2 nodes deleted.
24 relationship deleted.
Out[2]:

Adding new usage relationship

We want to look at the usage dependencies between methods and fields. The predefined schema distinguishes between read and write access to fields for each method. Since we only care whether a method uses a field of the class at all, we add a new relationship USES that covers both kinds of access.


In [3]:
%%cypher
MATCH
    (c:Class {name : "Reservation"}),
    (c)-[:DECLARES]->(m:Method),
    (c)-[:DECLARES]->(f:Field),
    (m)-[:READS|WRITES]->(f)
WHERE NOT (m:Constructor)
MERGE (m)-[u:USES]->(f)
RETURN m.name as method, type(u) as relType, f.name as field


9 relationships created.
Out[3]:
method             relType  field
getAdditionalFees  USES     fees
addFee             USES     fees
getPrincipalFee    USES     customer
extendForWeek      USES     customer
getPrincipalFee    USES     duration
extend             USES     duration
extend             USES     duration
extendForWeek      USES     date
getPrincipalFee    USES     dailyRate
extendForWeek      USES     dailyRate
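
The table lists "extend USES duration" twice although only 9 relationships were created. A plausible explanation is that extend both reads and writes duration (duration += additionalDays), so MATCH returns two rows while MERGE creates the relationship only once. The following Python sketch mimics this with plain tuples. The access list and the helper name merge_uses are assumptions for illustration: the list is transcribed by hand from the Reservation class above and not exported from the graph.

```python
# Field accesses per method, transcribed by hand from the Reservation class
# (an assumption for illustration; the real data lives in the graph).
ACCESSES = [
    ("getAdditionalFees", "READS", "fees"),
    ("addFee", "READS", "fees"),
    ("getPrincipalFee", "READS", "customer"),
    ("extendForWeek", "READS", "customer"),
    ("getPrincipalFee", "READS", "duration"),
    ("extend", "READS", "duration"),
    ("extend", "WRITES", "duration"),
    ("extendForWeek", "READS", "date"),
    ("getPrincipalFee", "READS", "dailyRate"),
    ("extendForWeek", "WRITES", "dailyRate"),
]


def merge_uses(accesses):
    """Collapse READS/WRITES rows into unique (method, field) USES pairs."""
    return {(method, field) for method, _access, field in accesses}


uses = merge_uses(ACCESSES)
print(len(ACCESSES), "matched rows,", len(uses), "USES relationships")
```

Using a set here plays the role of MERGE: a pair that is already present is not added a second time.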

We do the same for the dependencies between methods. Here, we can simply derive the USES relationship from the existing INVOKES relationship type.


In [4]:
%%cypher
MATCH
    (c:Class {name : "Reservation"}),
    (c)-[:DECLARES]->(m:Method),
    (c)-[:DECLARES]->(m2:Method),
    (m)-[:INVOKES]->(m2:Method)
WHERE NOT (m:Constructor)
MERGE (m)-[u:USES]->(m2)
RETURN m.name as caller, type(u) as relType, m2.name as callee


3 relationships created.
Out[4]:
caller         relType  callee
extendForWeek  USES     extend
getTotalFee    USES     getPrincipalFee
getTotalFee    USES     getAdditionalFees

Next, we calculate a weight for each method: the number of its outgoing USES relationships, i.e. how many fields and methods it uses.


In [5]:
%%cypher
MATCH (m:Method)-[u:USES]->()
WITH m, COUNT(u) as weight
SET m.weight = weight
RETURN m.name as method, weight


6 properties set.
Out[5]:
method             weight
getPrincipalFee    3
getAdditionalFees  1
getTotalFee        2
extendForWeek      4
addFee             1
extend             1
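
To make the counting step concrete, here is a small Python sketch that reproduces the weights from the table. The edge list is copied by hand from the two result tables above (nine method-to-field pairs and three method-to-method pairs), and the function name outgoing_weights is made up for this example.

```python
from collections import Counter

# USES relationships as (source, target) pairs, copied from the outputs above.
USES = [
    ("getAdditionalFees", "fees"),
    ("addFee", "fees"),
    ("getPrincipalFee", "customer"),
    ("extendForWeek", "customer"),
    ("getPrincipalFee", "duration"),
    ("extend", "duration"),
    ("extendForWeek", "date"),
    ("getPrincipalFee", "dailyRate"),
    ("extendForWeek", "dailyRate"),
    ("extendForWeek", "extend"),
    ("getTotalFee", "getPrincipalFee"),
    ("getTotalFee", "getAdditionalFees"),
]


def outgoing_weights(edges):
    """Count the outgoing USES relationships per source method."""
    return Counter(source for source, _target in edges)


weights = outgoing_weights(USES)
for method, weight in sorted(weights.items()):
    print(method, weight)
```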

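For methods that are used by other methods, we then overwrite the weight property with the number of incoming USES relationships.


In [6]: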
%%cypher
MATCH (m)-[u:USES]->(m2:Method)
WITH m2, COUNT(u) as weight
SET m2.weight = weight
RETURN m2.name as callee, weight


3 properties set.
Out[6]:
callee             weight
getPrincipalFee    1
getAdditionalFees  1
extend             1

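Now we copy the weight of each used item onto the USES relationship itself, because the community detection reads the weight from the relationship. Fields carry no weight property, so only the three method-to-method relationships receive a value, although the query matches all 12 relationships.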


In [7]:
%%cypher
MATCH (caller)-[r:USES]->(callee)
SET r.weight = callee.weight
RETURN count(r)


3 properties set.
Out[7]:
count(r)
12
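
The three weighting cells can be replayed outside the database to see why only 3 of the 12 relationships end up with a weight. This is a sketch under assumptions: the edge list and the set of methods are copied by hand from the outputs above, and relationship_weights is a hypothetical helper, not part of any library.

```python
from collections import Counter

# USES relationships as (source, target) pairs, copied from the outputs above.
USES = [
    ("getAdditionalFees", "fees"), ("addFee", "fees"),
    ("getPrincipalFee", "customer"), ("extendForWeek", "customer"),
    ("getPrincipalFee", "duration"), ("extend", "duration"),
    ("extendForWeek", "date"), ("getPrincipalFee", "dailyRate"),
    ("extendForWeek", "dailyRate"), ("extendForWeek", "extend"),
    ("getTotalFee", "getPrincipalFee"), ("getTotalFee", "getAdditionalFees"),
]
METHODS = {
    "addFee", "extend", "extendForWeek",
    "getAdditionalFees", "getPrincipalFee", "getTotalFee",
}


def relationship_weights(edges, methods):
    """Replay cells 5 to 7: node weights first, then copy them to the edges."""
    # Cell 5: outgoing USES count per method.
    node_weight = dict(Counter(source for source, _target in edges))
    # Cell 6: overwrite with the incoming USES count for called methods.
    node_weight.update(
        Counter(target for _source, target in edges if target in methods)
    )
    # Cell 7: an edge gets the weight of its target; fields have none.
    return {
        (source, target): node_weight[target]
        for source, target in edges
        if target in node_weight
    }


edge_weights = relationship_weights(USES, METHODS)
for edge, weight in sorted(edge_weights.items()):
    print(edge, weight)
```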

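With the weights in place, we let APOC detect communities in the USES graph. The call below runs 25 iterations of APOC's simple label propagation over the outgoing USES relationships, uses the weight property of the relationships, and stores the resulting community ID in the group property of each node.


In [8]: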
%%cypher
CALL apoc.algo.community(25,null,'group','USES','OUTGOING','weight',10000)


0 rows affected.
Out[8]:
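
The cell above delegates the grouping to APOC. To give an intuition for label propagation style community detection in general, here is a minimal Python sketch. It is not APOC's implementation: treating the relationships as undirected, giving every relationship the weight 1.0, the tie-breaking rule and the function name label_propagation are all assumptions made for illustration, so the resulting groups and IDs need not match the ones in this post.

```python
from collections import defaultdict

# USES relationships as (source, target) pairs, copied from the outputs above.
USES = [
    ("getAdditionalFees", "fees"), ("addFee", "fees"),
    ("getPrincipalFee", "customer"), ("extendForWeek", "customer"),
    ("getPrincipalFee", "duration"), ("extend", "duration"),
    ("extendForWeek", "date"), ("getPrincipalFee", "dailyRate"),
    ("extendForWeek", "dailyRate"), ("extendForWeek", "extend"),
    ("getTotalFee", "getPrincipalFee"), ("getTotalFee", "getAdditionalFees"),
]


def label_propagation(edges, max_iterations=25):
    """Generic weighted label propagation on an undirected graph.

    Every node starts in its own group and repeatedly adopts the label
    with the highest summed edge weight among its neighbours.
    """
    neighbours = defaultdict(dict)
    for source, target, weight in edges:
        neighbours[source][target] = neighbours[source].get(target, 0.0) + weight
        neighbours[target][source] = neighbours[target].get(source, 0.0) + weight

    labels = {node: node for node in neighbours}
    for _ in range(max_iterations):
        changed = False
        for node in sorted(neighbours):
            scores = defaultdict(float)
            for other, weight in neighbours[node].items():
                scores[labels[other]] += weight
            best = max(scores.values())
            candidates = sorted(l for l, s in scores.items() if s == best)
            # Keep the current label on a tie; otherwise take the smallest one.
            if labels[node] in candidates:
                continue
            labels[node] = candidates[0]
            changed = True
        if not changed:
            break
    return labels


labels_by_node = label_propagation([(s, t, 1.0) for s, t in USES])
for node, label in sorted(labels_by_node.items()):
    print(node, "->", label)
```

Labels are node names here instead of numeric IDs, which keeps the sketch deterministic and easy to read.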

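The community detection may put two methods that use the same field into different groups. We therefore look for such pairs and store both group IDs on the methods of the affected group, marking them as merged. In this example the query finds no such pair.


In [9]: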
%%cypher
MATCH (m:Method)-[:USES]->(f:Field)<-[:USES]-(m2:Method)
WHERE m.group <> m2.group 
WITH m.group as newGroupId, m2.group as oldGroupId
MATCH (n:Method) WHERE n.group = oldGroupId
SET n.group = [newGroupId, oldGroupId]
SET n.merged = true
RETURN DISTINCT(n.name), n.group;


0 rows affected.
Out[9]:
(n.name) n.group
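
The empty result can be checked by hand. The sketch below looks for methods that share a field but sit in different groups. It assumes the group IDs 41 and 45 that appear in the final output of this post, and both the data and the function name shared_field_conflicts are written down for illustration only.

```python
from collections import defaultdict
from itertools import combinations

# Method-to-field USES pairs, copied from the outputs above.
FIELD_USES = [
    ("getAdditionalFees", "fees"), ("addFee", "fees"),
    ("getPrincipalFee", "customer"), ("extendForWeek", "customer"),
    ("getPrincipalFee", "duration"), ("extend", "duration"),
    ("extendForWeek", "date"), ("getPrincipalFee", "dailyRate"),
    ("extendForWeek", "dailyRate"),
]
# Group per method, read off the final output of this post (an assumption
# about what the community detection stored in the group property).
GROUP_OF = {
    "getTotalFee": 41, "extend": 41, "extendForWeek": 41,
    "getPrincipalFee": 41, "addFee": 45, "getAdditionalFees": 45,
}


def shared_field_conflicts(field_uses, group_of):
    """Return (field, method, method) triples that cross a group boundary."""
    users = defaultdict(set)
    for method, field in field_uses:
        users[field].add(method)

    conflicts = set()
    for field, methods in users.items():
        for first, second in combinations(sorted(methods), 2):
            if group_of[first] != group_of[second]:
                conflicts.add((field, first, second))
    return sorted(conflicts)


print(shared_field_conflicts(FIELD_USES, GROUP_OF))
```

If addFee were moved to group 41, the field fees would be shared across two groups and the function would report that pair.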

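All remaining methods that use a field and were not touched by the previous step are explicitly marked as not merged.


In [10]: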
%%cypher
MATCH (m:Method)-[:USES]->(:Field)
WHERE NOT EXISTS(m.merged)
WITH m, m.group as groupId
SET m.merged = false
RETURN m.name, m.group;


0 rows affected.
Out[10]:
m.name m.group

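Finally, we create a NewClass node for each group and connect it via SHOULD_DECLARE to the methods of that group and to everything these methods use.


In [11]: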
%%cypher
MATCH (m:Method)-[:USES]->(f)
MERGE (c:NewClass { name: m.group})
MERGE (c)-[:SHOULD_DECLARE]->(m)
MERGE (c)-[:SHOULD_DECLARE]->(f)
RETURN c.name as newClass, m.name as method, f.name as field


2 nodes created.
2 properties set.
12 relationships created.
2 labels added.
Out[11]:
newClass  method             field
41        getTotalFee        getAdditionalFees
41        getTotalFee        getPrincipalFee
41        extend             duration
41        extendForWeek      extend
41        extendForWeek      customer
41        extendForWeek      date
41        extendForWeek      dailyRate
45        addFee             fees
45        getAdditionalFees  fees
41        getPrincipalFee    duration
41        getPrincipalFee    dailyRate
41        getPrincipalFee    customer
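
To read the result as class proposals, the rows can be grouped by the newClass column. The following Python sketch does this with the rows copied from the table above; the function name proposed_classes is made up for this example.

```python
from collections import defaultdict

# Rows of the final result: (newClass, method, used field or method).
ROWS = [
    (41, "getTotalFee", "getAdditionalFees"),
    (41, "getTotalFee", "getPrincipalFee"),
    (41, "extend", "duration"),
    (41, "extendForWeek", "extend"),
    (41, "extendForWeek", "customer"),
    (41, "extendForWeek", "date"),
    (41, "extendForWeek", "dailyRate"),
    (45, "addFee", "fees"),
    (45, "getAdditionalFees", "fees"),
    (41, "getPrincipalFee", "duration"),
    (41, "getPrincipalFee", "dailyRate"),
    (41, "getPrincipalFee", "customer"),
]


def proposed_classes(rows):
    """Collect all members a new class should declare, per class name."""
    members = defaultdict(set)
    for new_class, method, used in rows:
        members[new_class].update((method, used))
    return {name: sorted(items) for name, items in members.items()}


classes = proposed_classes(ROWS)
for name, items in sorted(classes.items()):
    print(name, items)
```

Class 45 collects addFee, fees and getAdditionalFees, while class 41 collects the rest. getAdditionalFees shows up in both proposals because getTotalFee in class 41 uses it. The member counts (9 and 3) add up to the 12 SHOULD_DECLARE relationships reported above.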